ratio of variances
0.9706621
As shown in Listing 11-4, the p value on the F test is 0.4684. As a rule of thumb:
If the p value is greater than 0.05, you would assume equal variances.
If the p value is 0.05 or less, you would assume unequal variances.
In this case, because the p value is greater than 0.05, equal variances can be assumed, and these data
would qualify for the classic Student t test. As described earlier, R gets around this step by defaulting to Welch’s t test, which accommodates both unequal and equal variances.
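As a rough sketch of how these commands fit together in R (the data frame glucose_data and the variables glucose and married_vs_other are placeholder names for illustration, not the book’s actual NHANES objects), you could run the F test and both flavors of the t test like this:

var.test(glucose ~ married_vs_other, data = glucose_data)   # F test of equal variances
t.test(glucose ~ married_vs_other, data = glucose_data)   # Welch t test (R’s default)
t.test(glucose ~ married_vs_other, data = glucose_data, var.equal = TRUE)   # classic Student t test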
Assessing the ANOVA
In this section, we present the basic concepts underlying the analysis of variance (ANOVA), which
compares the means of three or more groups. We also describe some of the more popular post-hoc
tests used to follow a statistically significant ANOVA. Finally, we show you how to run commands to
execute an ANOVA and post-hoc tests in R, and interpret the output.
Grasping how the ANOVA works
As described earlier in “Surveying Student t tests,” it is only possible to run a t test on two groups.
This is why we demonstrated the t test comparing married NHANES participants (M) to all other
marital statuses (OTH). We were testing the null hypothesis M – OTH = 0 because we were only
allowed to compare two groups! So when comparing three groups, such as married (M), never married
(NM), and all others (OTH), it’s natural to think of pairing up the groups and running three t tests
(meaning testing M – NM, then testing M – OTH, then testing NM – OTH). But running an exhaustive
set of two-group t tests increases the likelihood of Type I error, which is where you get a statistically significant result purely by chance (for a review, read Chapter 3). And this is just with three
groups!
The general rule is that N groups can be paired up in N(N – 1)/2 different ways, so in a study with six groups, you’d have (6 × 5)/2, or 15 two-group comparisons, which is way too many.
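If you want R to do this counting for you, the built-in choose() function returns the number of ways to pick two groups out of N:

choose(3, 2)   # 3 groups: 3 possible two-group t tests
choose(6, 2)   # 6 groups: 15 possible two-group comparisons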
The term one-way ANOVA refers to an ANOVA with only one grouping variable in it. The grouping
variable usually has three or more levels because if it has only two, most analysts just do a t test. In an
ANOVA, you are testing how spread out the means of the various levels are from each other. It is not
unusual for students to be asked to calculate an ANOVA manually in a statistics class, but we skip that
here and just describe the result. One result derived from an ANOVA calculation is a test statistic called the F ratio (designated simply as F). F is the ratio of the variability between the groups to the variability within the groups. If the null hypothesis is
true, and no true difference exists between the groups (meaning the average fasting glucose in M = NM
= OTH), then the F ratio should be close to 1. Also, F’s sampling fluctuations should follow the
Fisher F distribution (see Chapter 24), which is actually a family of distribution functions
characterized by the following two numbers seen in the ANOVA calculation:
The numerator degrees of freedom: This number is often designated as dfN or df1, which is one less than the number of groups being compared.
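As a preview of how these pieces show up in R, here is a minimal one-way ANOVA sketch; the data frame nhanes_sub, the outcome glucose, and the three-level marital factor marital3 are hypothetical placeholders rather than the book’s actual objects:

glucose_aov <- aov(glucose ~ marital3, data = nhanes_sub)   # one-way ANOVA with one grouping variable
summary(glucose_aov)   # reports the F ratio along with the numerator and denominator degrees of freedom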